Abstract

Overlapping fragments of genomic RNA spanning 6963 nucleotides from the 5’ end of the spike (S) protein gene to the 3’ end of the nucleocapsid (N) protein gene of turkey coronavirus (TCoV) were amplified by reverse transcription-polymerase chain reaction (RT-PCR). The primers were derived from the corresponding sequences of infectious bronchitis virus (IBV). The PCR products were cloned and sequenced, and their nucleic acid structure and similarity to published sequences of other coronaviruses were analyzed. Sequencing and subsequent analysis revealed 9 open reading frames (ORFs) representing the entire S protein gene, tricistronic gene 3, membrane (M) protein gene, bicistronic gene 5, and N protein gene in the order of 5’-3’. The overall nucleic acid structures of these coding regions of TCoV were very similar to the homologous regions of IBV. The consensus transcription-regulating sequence (TRS) of IBV, CT(T/G)AACAA, was highly conserved in the TCoV genome in both nucleotide sequence and location relative to the initiation codon of individual genes. Pair-wise comparison of gene 3, M gene, gene 5, or N gene sequences with their counterparts of IBV revealed high levels (82.1-92.0%) of similarity. Phylogenetic analysis based on the deduced amino acid sequences of S, M, or N protein demonstrated that TCoV clustered within the same genomic lineage as the IBV strains, while all the mammalian coronaviruses were grouped into separate clusters corresponding to antigenic groups I or II. There were substantial differences in S protein sequence between TCoV and IBV, with only 33.8-33.9% similarity.


© 2004 Elsevier B.V. All rights reserved. 
1. Introduction


Turkey coronavirus (TCoV) was identified in the early 1970s as the major causative agent of the most costly disease of turkeys encountered in Minnesota between 1951 and 1971 (Nagaraja and Pomeroy, 1997). Outbreaks of turkey poult enteritis associated with TCoV have caused severe economic losses in the turkey industry in Indiana, North Carolina, and other states over the last several years. Although the economic importance of this disease has been recognized for decades, the genomic organization of TCoV is poorly understood, and reports on the relationship of TCoV to other coronaviruses remain controversial (Van Regenmortel et al., 2000; Gonzalez et al., 2003).
Coronaviruses are pleomorphic, enveloped spherical particles surrounded by a fringe of 20 nm long club-shaped spikes. The diameter of coronaviral particles is around 140-150 nm. The coronavirus genome is a positive-sense, single-stranded, capped RNA with a polyadenylated 3’ end. Complete genomic RNA sequences have been determined for infectious bronchitis virus (IBV; 27,569 nucleotides; Boursnell et al., 1987), murine hepatitis virus (MHV; 31,092 nucleotides; Lee et al., 1991), human coronavirus (HCoV) strain 229E (27,277 nucleotides; Herold et al., 1993), and transmissible gastroenteritis virus (TGEV; 28,579 nucleotides; Eleouet et al., 1995; Penzes et al., 2001). The 5’ two-thirds of the coronavirus genome, approximately 20 kb, consists of two overlapping open reading frames (ORFs) that encode non-structural proteins including the viral RNA-dependent RNA polymerase and proteases. The remaining one-third of the genome, toward the 3’ end, contains ORFs that encode the major structural proteins:
Fig. 1. (A) Schematic representation of turkey coronavirus genomic RNA, showing locations of coding regions for the spike (S) protein, gene 3, membrane (M) protein, gene 5, and nucleocapsid (N) protein. (B) Schematic representation of the location of primers used in the polymerase chain reaction (PCR), along with the intervening sequences (fragments I-III, or IV) amplified by the PCR. The sequences of primers: S-cor, tgaaaactgaacaaaagacagact; AS-cor, ccaaacataccaaggccactt; AS-corF, aagtggccttggtatgtttgg; MIBVR, gttcacacttagcaagccactg; MIBV, taagctttcagtggcttgctaagtgtgaacc; NIBV, tggatccaccgctaccttcaaacttgggcgc; Nup, tcttttgccatggcaagc; Ndown, tactcaaagttcattctc.


spike (S), membrane (M), and nucleocapsid (N) proteins, in 5’-3’ order along the genome.

Turkey coronavirus was initially determined to be antigenically distinct from all other coronaviruses based on antigenic differences revealed by immunoelectron microscopy (Ritchie et al., 1973) and hemagglutination-inhibition (Dea et al., 1986). This unique antigenicity was questioned when a close relationship between TCoV and bovine coronavirus (BCoV) was demonstrated in a series of antigenic studies (Dea et al., 1990) and by sequence analysis of TCoV M and N genes (Verbeek and Tijssen, 1991). In contrast, recent antigenic (Guy et al., 1997; Loa et al., 2000) and genomic (Breslin et al., 1999a,b; Akin et al., 2001; Cavanagh et al., 2001, 2002; Lin et al., 2002) analyses of TCoV demonstrated that TCoV and IBV, two avian coronaviruses, are closely related. The causes of these discrepant results regarding the relationship of TCoV to BCoV or IBV remain unclear. Further analysis of the genomic structure of TCoV is important to clarify this enigma. Thus, the purpose of the present study was to determine the sequence of the 3’ end coding region for the structural protein genes of TCoV.


2. Materials and methods 


2.1. Turkey coronavirus 


The TCoV isolate (isolate 540) used in the present study was recovered from fecal contents and intestines of turkey poults with acute coronaviral enteritis in Indiana, US, in 1994. The virus was passaged 5 times in 22-day-old embryonating turkey eggs. The presence of TCoV in the intestines of embryos was confirmed by TCoV-specific immunofluorescence antibody assays and electron microscopy at the Indiana State Animal Disease Diagnostic Laboratory in West Lafayette, Indiana, US.


2.2. RNA isolation and reverse transcription 


Total RNA was extracted from the intestines and intestinal contents of turkey embryos infected with TCoV by a modified method using guanidinium thiocyanate and acid-phenol (Chomczynski and Sacchi, 1987; Akin et al., 1999). Conversion of total RNA to cDNA was performed essentially according to the protocol supplied by the manufacturer of the reverse transcriptase (Superscript II system, Life Technologies, Gaithersburg, MD).


2.3. PCR amplification 


Three microliters of cDNA were used in PCR amplifications with primers designed from IBV genomic sequences. The locations and sequences of the primers for amplification of the 4 fragments (I-IV) covering the 3’ end coding region of TCoV structural protein genes are outlined in Fig. 1. PCR was performed with a mixture (64:1, v:v) of Taq (Promega Corp., Madison, WI) and Pfu (Stratagene, La Jolla, CA) polymerases in a 96-well thermal cycler (GeneAmp, Perkin-Elmer Cetus Corp., Norwalk, CT) (Barnes, 1994; Akin et al., 1999). The cycling parameters of the PCR were as follows: 40 cycles of 94 °C for 1 min for denaturation, 37 °C for 2 min for annealing, and 72 °C for 5 min for extension, followed by 72 °C for 10 min for final extension.
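The primer sequences in the Fig. 1 legend can be checked for how they relate to one another and to the sequenced region. The following is a minimal sketch, not part of the original methods: the helper `reverse_complement` and the variable names are illustrative, the AS-cor sequence is assumed to read continuously across the line break in the legend, and the 21-nucleotide tail is taken from the last TCoV row of Fig. 2.

```python
# Primer sequences (5'->3') as listed in the Fig. 1 legend.
PRIMERS = {
    "AS-cor": "ccaaacataccaaggccactt",  # assumed continuous across a legend line break
    "AS-corF": "aagtggccttggtatgtttgg",
    "MIBVR": "gttcacacttagcaagccactg",
    "MIBV": "taagctttcagtggcttgctaagtgtgaacc",
    "Ndown": "tactcaaagttcattctc",
}

# Last 21 nucleotides of the TCoV sequence shown in Fig. 2 (ending at position 6963).
TCOV_3PRIME_TAIL = "ggggagaatgaactttgagta"

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (lower case, a/c/g/t only)."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


# AS-cor and AS-corF cover the same site on opposite strands.
ascor_pair_same_site = reverse_complement(PRIMERS["AS-cor"]) == PRIMERS["AS-corF"]

# MIBV contains the reverse complement of MIBVR (plus extra 5' bases).
mibv_pair_same_site = reverse_complement(PRIMERS["MIBVR"]) in PRIMERS["MIBV"]

# Ndown is complementary to the very 3' end of the sequenced region.
ndown_at_3prime_end = TCOV_3PRIME_TAIL.endswith(reverse_complement(PRIMERS["Ndown"]))

print(ascor_pair_same_site, mibv_pair_same_site, ndown_at_3prime_end)
```

Under these assumptions, adjacent fragments share a primer site, which is consistent with the fragments being described as overlapping.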


2.4. Molecular cloning and sequencing 


One microliter of the amplification product was ligated into the pCR-II plasmid vector according to the manufacturer’s instructions (Invitrogen, San Diego, CA). The nucleotide sequences of the selected clones carrying the amplified sequences were determined for both strands by the dideoxy cycle sequencing method with the corresponding sequencing primers (DAVIS Sequencing, Davis, CA).


2.5. Sequence analysis 


The nucleotide and deduced amino acid sequences of TCoV and other coronaviruses were analyzed with the DNAstar program (Lasergene Corp, Madison, WI). Percent similarities were calculated to find nucleic
Fig. 2. Nucleotide sequence of the amplified fragments containing the entire spike (S) protein gene, gene 3, membrane (M) protein gene, gene 5, and nucleocapsid (N) protein gene region of turkey coronavirus (TCoV) and their similarity to those of infectious bronchitis virus (IBV) strain Beaudette (GenBank accession number AJ311317). Positions where nucleotide bases are missing are indicated as (-) and identical nucleotides as (.). Heavy underlines below the sequence of TCoV indicate the putative start codons. Light lines above the sequence of TCoV indicate the stop codons. The conserved transcription-regulating nucleotide sequence (A/C)T(T/G)AACAA, which is located upstream from the start codons of individual genes, is boxed. The start codon of the IBV M protein gene is also underlined because it is at a different position from that of the TCoV M protein gene.
CTGAACTATTTGACCCCTTTGAAGTTTGTGTTTACAGAGGAGGTAATTATTGGGAGTTAGAGTCAGCTGACGAGTTTTCA
AACCATACTACTTCAGTATGGGTACGCAACTAGGAGTCGGGTTATTTACATACTGAAAATGATAGTGTTATGGTGCTTTT
GGCCCATTAACATTGCAGTAGGTGTAATTTCATGTGTATATCCACCAAATACAGGAGGTCTTGTCGCAGCGATAATACTT
GCAACAGTTTTTCCTTTCTTTTGTTTGGAAGAAAGTTGTTGTTAATGGTGTAGAATTCCAAGTAGAAAATGGAAAAGTCC
ACTACGAAGGAAATCCAATCTTCCAAAAAGGTTGTTGTAGTTTGTGGTCCGATTATAAGAAAGATTAGAATAATTAAGCC
ATACGGACGATGAAATGGCTGACTAGT
TTTGGAAGAGCGGTTGTTTCATGTTACAAAGCACTACTATTAACTCAACTTAGAGTGTTAGATAGGTTAATTTTAGATCA
CGGACCAAAACGCGTTTTAACGTGTAGTAGGCGAGTGCTTTTGTTTCAGTTAGATTTAGTTTATAGGTTGGCGTTTACGC
5b
CCACCCAACCGCTGGTATGAATAA------AGATAATCCTTTTCGCAGAGCAATAGCAAGAAAAGCGCGAATTTATCTGA
N gene
GAGAAGGATTAGATTGTGTTTACTTYCTTAACAAAGCAGGACAAGCAGATCCTTGCCCCGAGTGTACCTCTCTGGTATTC
N
CGAGGGAAAATTTGTGAGGAACACATAAATAATAATAATCTTTTGTCATGGCAAGCGGTAAGGCTACTGGAAAAACAGAC

TCoV GCCAGCAACATGGGTATTGGAGACGCCAAGCTAGGTTTAAGTCAAGTAGAGGAGGAAGAAAACCAGTCCCAGATGCCTGG 6003
TCoV TACTTCTATTATACTGGAACAGGACCAGCCGCTGACCTGCAATGGGGTGATAGCCAAGCGGGTATAGTGTGGGTTGCTGC 6083
TCoV AAAGGGTGCTGATGTTAAATCTAAATCTAACCAGGGTACAAGGGACCCTGACAAGTTTGACCAATATCCACTACGTTTCT 6163
TCoV CTGATGGTGGACCTGATGGTAATTTCCGTTGGGATTTCATTCCTTTGAATCGTGGTAGGAGTGGAAGATCAACAGCAGTT 6243
TCoV TCATCAGCAGCATCTATTAGAGCACCATCACGTGAAGGTTCGCGTGGTCGTAGAAGTGGTGCTGAAGATGATCTTATTGC 6323
TCoV TCGCGCAGCAAAGATAATTCAGGATCAGCAGAAGAGGGGTTCTCGCATTACCAAGGCTAAGGCTGAAGAAATGGCCCATC 6403
TCoV GCCGGTACTGCAAGCGCACTGTCCCACCTGGTTATAAGGTTGAGCAAGTTTTTGGTCCCCGTACTAAAGGTAAGGAGGGA 6483
TCoV AATTTTGGTGATGACAAGATGAATGAGGAAGGTATTAAGGATGGGCGTGTTACAGCAATGCTCAACCTAGTCCCTAGCAG 6563
TCoV CCATGCTTGTCTTTTTGCAAGTAAATTGACGCCCAAACTTCAACCAGATGGGCTGCACTTGAAATTTGAATTTACTACTG 6643
TCoV TGGTCCCACGTGATGATCCGCAGTTTGATAATTATGTTAGTATTTGTGATCAGTGTGTCGATGGTGTAGGAACACGTCCA 6723
TCoV AAAGATGACGAACCGAGACCAAAGTCACGCGCAAGTTCAAGACCTGCTACAAGAGGAAATTCTCCAGCTCCAAGACAACA 6803
TCoV GCGTCTAAAGAAGGAGAAAAAGCCAAAGAAGCAGGATGATGAAGTAGATAAAGCATTGACCTCAGATGAGGAGAGGAACA 6883
TCoV ATGCACAGCTGGAATTTGATGATGAACCCAAGGTGATTAACTGGGGGGATTCAGCTCTAGGGGAGAATGAACTTTGAGTA 6963


acid and amino acid pair distances. Based on the obtained sequences of TCoV and previously published sequences of different coronaviruses, phylogenetic trees were constructed according to the coding sequences for S, M, and N proteins.


3. Results


3.1. Complete nucleotide sequences of 3’ end coding region for structural protein genes of turkey coronavirus


Cloning and sequencing of the 4 overlapping fragments revealed a total of 6963 nucleotides in a region containing the entire S protein gene, tricistronic gene 3, M protein gene, bicistronic gene 5, and N protein gene of TCoV in the present study. The primary structures of the coding sequences for these genes of TCoV in the present study were very similar to those found in the corresponding genomic regions of IBV strain Beaudette as shown in Fig. 2 and Table 1. The canonical consensus transcription-regulating sequence (TRS) of IBV, CT(T/G)AACAA, was also found in TCoV in the present study. Both the nucleotide sequence of the TRS and the distance between the 3’ end of the TRS and the initiation codon of the downstream adjacent ORF were highly conserved between TCoV in the present study and IBV (Table 1).


Table 1

Comparison of the 3’ end coding regions between turkey coronavirus (TCoV) and infectious bronchitis virus (IBV) strain Beaudette (GenBank accession number AJ311317)

Virus   Gene           ORF (a) size     TRS (b)     TRS distance (c)
                       (nucleotides)    sequence    (nucleotides)
TCoV    Spike          3612             ctgaacaa    52
        Gene 3                          atgaacaa    23
          3a           174
          3b           195
          3c           318
        Membrane       669              cttaacaa    74
        Gene 5                          cttaacaa    9
          5a           198
          5b           243
        Nucleocapsid   1230             cttaacaa    93
IBV     Spike          [illegible]      ctgaacaa    52
        Gene 3                          ctgaacaa    23
          3a           174
          3b           195
          3c           330
        Membrane       678              cttaacaa    77
        Gene 5                          cttaacaa    9
          5a           [illegible]
          5b           249
        Nucleocapsid   1230             cttaacaa    93

(a) ORF: open reading frame.
(b) TRS: transcription-regulating sequence.
(c) The distance is calculated as nucleotides between the 3’ end of the TRS and the ATG start codon of the corresponding first downstream ORF.

Table 2

Sequence pair distances for nucleic acid and deduced amino acid sequences of the entire spike protein gene region of turkey coronavirus (TCoV) and other coronaviruses

Nucleotide identity (%) is shown above the diagonal and amino acid identity (%) below the diagonal.

                      1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16
 1 TCoV (a)         100  95.1  95.4  51.7  52.5  52.1  42.9  41.8  41.8  38.3  42.6  42.8  41.1  38.8  41.3  39.3
 2 TCoV-Gh (b)     90.2   100  96.8  50.6  51.3  51.0  42.2  41.4  41.3  38.1  41.9  42.2  40.2  38.1  41.0  38.4
 3 TCoV-GI (c)     91.1  95.1   100  50.3  51.0  50.7  42.0  41.3  41.3  37.9  41.8  42.1  40.1  38.3  41.0  38.2
 4 IBV-Cu (d)      33.9  32.8  32.7   100  85.3  85.6  41.7  39.9  39.7  38.6  41.5  41.8  40.6  38.6  40.3  39.5
 5 IBV-KB (e)      33.8  33.1  32.7  83.2   100  94.4  42.6  40.7  40.6  38.1  42.2  42.7  41.1  39.3  41.2  40.6
 6 IBV-Beau (f)    33.9  33.1  32.7  84.6  94.0   100  42.8  40.7  40.7  38.1  42.3  42.9  41.6  39.3  41.1  40.4
 7 BCoV (g)        22.3  21.8  21.8  20.4  21.3  20.7   100  38.2  38.6  33.2  93.9  98.7  67.1  36.1  38.7  38.6
 8 CCoV (h)        25.4  24.0  24.5  23.8  23.2  23.3  21.3   100  90.4  44.5  35.5  3939  35.6  50.3  82.7  33.7
 9 FECoV (i)       25.2  23.9  24.4  24.0  23.3  23.5  21.3  93.1   100  44.1  36.3  36.2  36.1  50.9  84.3  33.1
10 HCoV229E (j)    25.9  25.2  25.6  24.7  24.8  24.5  18.4  38.8  39.1   100  38.0  38.5  37.8  37.5  55.3  36.5
11 HCoVOC43 (k)    22.3  21.8  21.7  20.3  21.2  20.7  91.9  20.2  20.1  21.0   100  94.1  67.0  36.2  38.6  38.2
12 HECoV (l)       22.3  21.6  21.7  20.3  21.2  20.6  97.9  20.0  19.8  21.4  92.2   100  67.0  36.2  38.6  38.5
13 MHV (m)         22.3  22.1  21.7  21.1  21.8  21.2  65.7  19.5  19.5  22.1  65.4  65.9   100  35.8  38.4  37.2
14 PEDV (n)        26.4  25.4  25.9  25.1  25.4  25.5  19.8  42.6  43.3  49.7  20.2  20.2  20.9   100  52.6  32.8
15 TGEV (o)        25.0  23.6  24.2  24.1  23:D  23.6  21.3  80.5  81.5  48.0  21.6  21.4  21.2  44.8   100  33.8
16 SARS (p)        19.5  18.6  18.8  21.6  21.1  21.4  22.6  18.2  18.3  22.1  22.4  22.7  22.3  19.7  17.9   100

(a) TCoV: a US (Indiana) isolate of TCoV.
(b) TCoV-Gh: an isolate of TCoV. GenBank accession number AY342356.
(c) TCoV-GI: an isolate of TCoV. GenBank accession number AY342357.
(d) IBV-Cu: a German strain, CU-T2, of infectious bronchitis virus (IBV). GenBank accession number U49858.
(e) IBV-KB: a Japanese strain, KB8523, of IBV. GenBank accession number M21515.
(f) IBV-Beau: a US strain, Beaudette, of IBV. GenBank accession number AJ311317.
(g) BCoV: bovine coronavirus. GenBank accession number M64668.
(h) CCoV: canine coronavirus. GenBank accession number X77047.
(i) FECoV: feline enteric coronavirus. GenBank accession number X80799.
(j) HCoV229E: human coronavirus strain 229E. GenBank accession number AF344186.
(k) HCoVOC43: human coronavirus strain OC43. GenBank accession number L14643.
(l) HECoV: human enteric coronavirus. GenBank accession number L07748.
(m) MHV: murine hepatitis coronavirus. GenBank accession number U72635.
(n) PEDV: porcine epidemic diarrhea coronavirus. GenBank accession number Z25483.
(o) TGEV: porcine transmissible gastroenteritis coronavirus. GenBank accession number AJ271965.
(p) SARS: severe acute respiratory syndrome coronavirus. GenBank accession number NC_004718.


3.2. Sequence comparison and phylogenetic analysis


Pair-wise comparison of nucleotide and deduced amino acid sequence distances between the TCoV S protein gene in the present study and the homologous gene sequences of other known coronaviruses is summarized in Table 2. The similarity scores between the TCoV in the present study and non-TCoV coronaviruses within the S protein gene region ranged from 38.3% to 52.5% at the nucleotide level and from 19.5% to 33.9% at the amino acid level.
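The ranges quoted above can be recomputed from Table 2. The sketch below transcribes the entries that pair the TCoV isolate of this study with each non-TCoV coronavirus, assuming that nucleotide identities occupy the upper triangle of the table and amino acid identities the lower triangle. The dictionary and function names are illustrative.

```python
# S gene identities (%) between the TCoV isolate of this study and each
# non-TCoV coronavirus, transcribed from Table 2.
NT_IDENTITY = {
    "IBV-Cu": 51.7, "IBV-KB": 52.5, "IBV-Beau": 52.1, "BCoV": 42.9,
    "CCoV": 41.8, "FECoV": 41.8, "HCoV229E": 38.3, "HCoVOC43": 42.6,
    "HECoV": 42.8, "MHV": 41.1, "PEDV": 38.8, "TGEV": 41.3, "SARS": 39.3,
}
AA_IDENTITY = {
    "IBV-Cu": 33.9, "IBV-KB": 33.8, "IBV-Beau": 33.9, "BCoV": 22.3,
    "CCoV": 25.4, "FECoV": 25.2, "HCoV229E": 25.9, "HCoVOC43": 22.3,
    "HECoV": 22.3, "MHV": 22.3, "PEDV": 26.4, "TGEV": 25.0, "SARS": 19.5,
}


def identity_range(scores):
    """Return (lowest, highest) percent identity in a virus -> percent mapping."""
    values = list(scores.values())
    return min(values), max(values)


nt_range = identity_range(NT_IDENTITY)
aa_range = identity_range(AA_IDENTITY)
closest_by_nt = max(NT_IDENTITY, key=NT_IDENTITY.get)

print("nucleotide range:", nt_range)
print("amino acid range:", aa_range)
print("highest nucleotide identity:", closest_by_nt)
```

With this reading of the table, the three IBV strains give the highest scores at both levels, and all mammalian coronaviruses fall below them.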

The similarity scores between the TCoV in the present study and IBV strains within the M or N protein gene region were high (>80%). In contrast, the similarity scores between the TCoV in the present study and mammalian coronaviruses within the M or N protein gene region ranged from 24.8% to 30.8% for nucleotide sequences and from 16.9% to 29.1% for deduced amino acid sequences. The tricistronic gene 3, with 3 overlapping ORFs (3a-c) between the S and M genes, and the bicistronic gene 5, with 2 overlapping ORFs (5a and 5b) between the M and N genes, both shared high similarity with the corresponding genomic sequences of IBV strains.

Phylogenetic analysis according to the deduced amino acid sequences of S, M, or N proteins indicated that the TCoV in the present study clustered within the same genomic lineage as the IBV strains, while all the mammalian coronaviruses were grouped into separate clusters corresponding to antigenic groups I and II (data not shown).


4. Discussion 


A total of 6963 nucleotides of the TCoV genome were cloned and sequenced in the present study. This region is likely to include all of the viral genes except the polymerase gene and thus provides substantial genetic information on the virus for comparison with other coronaviruses. The genomic structures of the ORFs for S protein, 3a-c, M protein, 5a-b, and N protein were very similar to those of IBV. Phylogenetic analysis based on the deduced amino acid sequences of S, M, or N protein consistently placed the TCoV in the present study within the same genomic lineage as IBV strains, while all the mammalian coronaviruses, including BCoV, were grouped into separate clusters. The nucleotide sequences of the ORFs for 3a-c, M protein, 5a-b, and N protein of TCoV shared high similarity (82.1-92.0%) with the corresponding sequences of IBV. These results clearly demonstrate the close relationship of the TCoV in the present study to avian IBV.

The presence of tricistronic gene 3 between the 3’ end of the S gene and the 5’ end of the M gene, as well as the presence of bicistronic gene 5 between the 3’ end of the M gene and the 5’ end of the N gene, are particular features of the avian coronaviruses TCoV and IBV. These genomic structures have not been found in any mammalian coronavirus examined to date. These distinct features of genome structure imply that TCoV shares a relatively close evolutionary relationship with IBV.


The predicted proteins of ORFs 3a-c, 5a, and 5b are small (about 10 kDa or less). The functions of these gene products are not known. Several ORFs encoding non-structural proteins have been recognized in coronavirus genomes (Boursnell et al., 1987; Lee et al., 1991; Herold et al., 1993; Eleouet et al., 1995). The number, nucleotide sequence, and gene order of these ORFs vary remarkably among different coronaviruses. It has been speculated that these genes were inserted into different sites in the coronavirus genomes through the RNA recombination-prone discontinuous transcription mechanism and are not essential for virus replication and pathogenesis. However, sequence analysis in the present study indicated that both the nucleotide sequences and the locations of these ORFs and their consensus TRS in TCoV are highly conserved with those of IBV. Given such highly conserved sequences and structures within avian coronaviruses, genes 3 and 5 may play important roles in coronavirus pathogenesis in avian species.
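The small size of the predicted gene 3 and gene 5 products can be illustrated with a rough calculation from the TCoV ORF sizes in Table 1. This is only a back-of-the-envelope sketch: it assumes that each listed ORF size includes the stop codon and uses an average residue mass of about 110 Da, a common rule of thumb rather than a value taken from this study.

```python
# TCoV ORF sizes in nucleotides, from Table 1.
ORF_SIZES_NT = {"3a": 174, "3b": 195, "3c": 318, "5a": 198, "5b": 243}

# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def predicted_mass_kda(orf_nt: int, includes_stop: bool = True) -> float:
    """Estimate protein mass in kDa from an ORF length in nucleotides."""
    codons = orf_nt // 3
    residues = codons - 1 if includes_stop else codons
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


estimated_masses = {name: predicted_mass_kda(nt) for name, nt in ORF_SIZES_NT.items()}

for name, mass in estimated_masses.items():
    print(f"ORF {name}: about {mass:.1f} kDa")
```

On this estimate the products range from roughly 6 to 12 kDa, broadly in line with the description of these proteins as about 10 kDa or less.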

One of the characteristic features of coronavirus replication is the synthesis of a 3’ coterminal nested set of polycistronic subgenomic mRNAs by a discontinuous transcription mechanism. Several conserved TRS have been identified in different coronaviruses proximal to the initiation codon of the first ORF of each subgenomic mRNA. The consensus sequences of the TRS sites are CT(T/G)AACAA for IBV, ATC(T/C)AAAC for BCoV, AACTAAAC for TGEV, AATC(T/C)A(A/T)AC for MHV, and AACTAAAC for FIPV (Spaan et al., 1988; Stirrups et al., 2000). The distance between the TRS and the first ORF differs for each subgenomic mRNA of different coronaviruses. Both the nucleotide sequence of the TRS and the distance between the TRS 3’ end and the initiation codon of the first ORF are suggested to play an important role in the transcription of mRNAs. As shown in the present study, the TRS sequences of TCoV were highly conserved with those of the corresponding genes of IBV except for one nucleotide substitution in that of gene 3: the TRS of gene 3 is ATGAACAA for TCoV and CTGAACAA for IBV. The distances between the TRS and the initiation codon of the S gene, gene 3, gene 5, and N gene of TCoV were all the same as those of IBV, while the distances for the M gene are 74 nucleotides for TCoV and 77 for IBV. The highly conserved sequences and structures of the TRS between TCoV and IBV provide further evidence that these two avian coronaviruses share a close evolutionary relationship. These highly conserved TRS sequences of IBV have been shown to be recombination “hot spots” and may serve as template-switching sites for the virus-encoded RNA-dependent RNA polymerase (Lee and Jackwood, 2000). These recombination events play an important role in the emergence of new IBV variants responsible for continuing outbreaks in chicken flocks vaccinated with live attenuated viruses, owing to failure of cross-protection. It is possible that similar recombination events of IBV in chickens contributed to the origin and evolution of TCoV in turkeys; this possibility merits further investigation.
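The TRS comparison described above can be restated as a small check against the values in Table 1. The sketch below is illustrative: the consensus sequences are written as regular expressions, the TCoV TRS sequences and the TRS-to-ATG distances are transcribed from Table 1, and all variable names are hypothetical.

```python
import re

# IBV consensus CT(T/G)AACAA (from the text) and the broader
# (A/C)T(T/G)AACAA given in the Fig. 2 legend.
IBV_CONSENSUS = re.compile(r"CT[TG]AACAA")
EXTENDED_CONSENSUS = re.compile(r"[AC]T[TG]AACAA")

# TCoV TRS sequences per gene, from Table 1 (upper case for matching).
TCOV_TRS = {
    "S": "CTGAACAA",
    "gene 3": "ATGAACAA",
    "M": "CTTAACAA",
    "gene 5": "CTTAACAA",
    "N": "CTTAACAA",
}

# Nucleotides between the 3' end of the TRS and the ATG, from Table 1.
TRS_TO_ATG = {
    "TCoV": {"S": 52, "gene 3": 23, "M": 74, "gene 5": 9, "N": 93},
    "IBV": {"S": 52, "gene 3": 23, "M": 77, "gene 5": 9, "N": 93},
}

# Genes whose TCoV TRS falls outside the strict IBV consensus.
outside_ibv_consensus = [
    gene for gene, trs in TCOV_TRS.items() if not IBV_CONSENSUS.fullmatch(trs)
]

# Every TCoV TRS should still fit the broader consensus from the Fig. 2 legend.
all_fit_extended = all(EXTENDED_CONSENSUS.fullmatch(trs) for trs in TCOV_TRS.values())

# Genes whose TRS-to-ATG distance differs between the two viruses.
distance_differs = [
    gene for gene in TRS_TO_ATG["TCoV"]
    if TRS_TO_ATG["TCoV"][gene] != TRS_TO_ATG["IBV"][gene]
]

# Number of substitutions between the gene 3 TRS of TCoV and of IBV.
gene3_substitutions = sum(a != b for a, b in zip("ATGAACAA", "CTGAACAA"))

print(outside_ibv_consensus, all_fit_extended, distance_differs, gene3_substitutions)
```

The check singles out gene 3 for the sequence difference and the M gene for the distance difference, matching the two exceptions named in the text.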

Even though the close genetic relationship between TCoV and IBV was clearly demonstrated as discussed above, these two avian coronaviruses are dramatically different at the level of the S protein gene. The similarity of S protein sequences between the TCoV in the present study and IBV strains (33.8-33.9%) is much lower than that among IBV strains (83.2-94.0%). The nucleotide differences between the TCoV in the present study and IBV seem to be randomly distributed throughout the entire S gene except for a stretch of 225 nucleotides at the 3’ end that shared high similarity (88.9%) with the corresponding sequence of IBV. These observations suggest that cross-over homologous recombination, very likely by a template-switching mechanism, occurred around the consensus TRS site of the S gene and within the 3’ end 225-nucleotide region (involving the TRS site of gene 3), resulting in a whole new codon reading frame for the S protein of TCoV while retaining the conserved TRS and other genomic structural features of IBV. The spike protein of coronaviruses is well known as the major structural protein responsible for attachment, fusion, and penetration of virions into target cells. The substantial difference in the S protein gene between TCoV and IBV well explains the different host tropism and tissue pathogenicity of these two avian coronaviruses: TCoV is associated with enteric disease in turkeys, while IBV is usually associated with respiratory disease in chickens.

Two group-specific monoclonal antibodies, which reacted with a broad spectrum of homologous and heterologous IBV serotypes, were tested for reactivity with TCoV in a previous study (Loa et al., 2000). The antibody specific to the M protein of IBV (Mab 919) cross-reacted strongly with TCoV, but the antibody specific to the S protein of IBV (Mab 94) did not react with TCoV. In line with these previous observations of antigenicity, the sequence analysis in the present study revealed high homology of the M protein gene between the TCoV in the present study and IBV. On the other hand, the difference in the S protein gene between the TCoV in the present study and IBV is substantial. Therefore, molecular diagnostic assays or antigenic analyses targeting the S gene or using antibodies specific to the S protein will be useful tools to differentiate TCoV from IBV.

The results of sequence analysis in the present study stress the close relationship of TCoV to IBV. Coronavirus genomes are dynamic, with a high frequency of recombination, insertion, and deletion, which may result in significant genetic differences. Further cloning and sequencing of full-length genomic sequences of more TCoV isolates are under way to give a faithful picture of the TCoV genome.